

The Problem 


• ISS (and analog) research sometimes results in 
low confidence with regard to inferences 
about the general astronaut population 

- Small-n

— Non-random samples 
— Mission constraints on data acquisition 
— Lack of control over some data acquisition 

- Experimental confounds/competing studies 

- ...just to name a few 

Sample Size Issues



• Justifying new research with small-n 

• Magnified pragmatic concerns (ex. missing 
data, attrition, protocol constraints) 

• Analytic diagnostics testing prior to hypothesis 
tests 

• Limited statistical options for hypothesis tests 


...How can we get more from small-n studies? 


Mainstream Scientific Approach 


Collect data from a well-designed study 

Test hypotheses about the population of 
interest 

- Typically focused on the average effect for
outcomes that are continuously scaled, or a 
difference in probabilities if the outcome is 
categorical. 

Results sections of manuscripts emphasize 
differences between groups and averages. 



"Average" Men and Women 


Average female (left) and male (right) composite faces, made from 
64 individual female and male images each. 



"The attractiveness ratings of the transformed faces depend on the number of original 
faces that have been used to create them. The more original images were used to create 
the composite, the more attractive it was rated, (r = 0.57 ** for female faces, r = 0.64 ** 
for male faces)... Average faces are attractive..." 


http://tinyurl.com/xr7j 




Attractive female faces: 


Unattractive female faces: 



Attractive male faces: 



Unattractive male faces: 



The Individual vs. the Average 


Doctors treat patients... not averages

NASA sends astronauts... not composites of 
available candidates 

When you make personal decisions, you probably 
consider the consequences to YOURSELF... not 
the average Joe or Jane Doe. 

How can scientists who emphasize groups and 
averages move towards individualized 
knowledge? 



How Can Science Become more 
Clinical/Individualized?


• Continue to apply our current methods 

— This talk is, by no means, an argument against the 
scientific method! 

• Augment our current methods (analytics, 
reports) in ways that help the reader 
understand the potential consequences to
hypothetical future individuals...


Using Data-Driven Simulations to 
Augment Traditional Analyses 

• Perform your usual cadre of statistical tests of 
hypotheses for manuscripts, etc. 

• Consider augmenting your sample data with 
other relevant data if available 

- Consult your discipline's knowledge & literature to
improve your theory & assumptions

• Consider the most likely distribution of your 
outcome variable(s) (ex. Gaussian) 

• Calculate summary statistics (ex. mean, sd) from 
your sample 
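
A minimal sketch of the summary-statistics step, using only the Python standard library. The T-score values below are made up for illustration and are not data from any study.

```python
import statistics

# Hypothetical sample of T-scores (illustrative values only, not real data).
sample_t_scores = [-1.0, 0.2, -0.7, -1.9, 0.4, -0.3, -1.5, 0.0, -0.8, -1.4]

n = len(sample_t_scores)
sample_mean = statistics.mean(sample_t_scores)
sample_sd = statistics.stdev(sample_t_scores)   # sample SD (n - 1 denominator)
standard_error = sample_sd / n ** 0.5           # SE of the mean

print(f"n = {n}, mean = {sample_mean:.2f}, sd = {sample_sd:.2f}, se = {standard_error:.2f}")
```

The mean and sd computed here are the inputs that the simulation step would use, together with the distributional assumption chosen for the outcome.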




Simulations (cont.) 


• Simulate future samples given the summary 
statistics from your data, and the assumptions that
you made about the outcome 

— Ex. Draw a sample from a normal distribution with 
mean = μ and standard deviation = σ

• Repeat the simulation several hundred times 

• Graph the simulated data, along with any 
relevant clinical, operational, or scientifically 
meaningful reference values 

• Calculate the probability of a future individual 
falling above/below the relevant reference values 
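
The steps above can be sketched as follows, assuming a normally distributed outcome and using only the Python standard library. The function names, the seed, and the input values (mu, sigma, reference) are hypothetical placeholders, not values from this talk; the graphing step is left out.

```python
import random
from statistics import NormalDist, mean

def simulate_samples(mu, sigma, n_per_sample, n_samples, seed=0):
    """Draw n_samples simulated samples of size n_per_sample from Normal(mu, sigma)."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for _ in range(n_per_sample)]
            for _ in range(n_samples)]

def fraction_below(samples, reference):
    """Share of all simulated individuals falling below the reference value."""
    values = [x for sample in samples for x in sample]
    return sum(x < reference for x in values) / len(values)

# Hypothetical inputs: replace with your own summary statistics and reference value.
mu, sigma = 0.0, 1.0
reference = -2.0

samples = simulate_samples(mu, sigma, n_per_sample=10, n_samples=1000)
sample_means = [mean(s) for s in samples]           # one mean per simulated sample, for plotting
simulated_p = fraction_below(samples, reference)
analytic_p = NormalDist(mu, sigma).cdf(reference)   # exact value under the same assumption

print(f"simulated P(below) = {simulated_p:.3f}, analytic = {analytic_p:.3f}")
```

Under a pure normal assumption the probability can be computed directly from the distribution; the simulation earns its keep through the picture of sample-to-sample variability, and it extends to outcomes where no closed form is available.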


Example: Hip BMD 


Shapiro-Wilk W test for normal data 

    Variable |  Obs        W        V        z   Prob>z
-------------+-----------------------------------------
TotalHip_B~l |   51  0.99122    0.419   -1.855  0.96822



[Figure: plot of TotHipPostBMD; x-axis from 0.6 to 1.4]



This is just a placeholder slide... will discuss why I chose this example... 

T-scores are normally distributed, widely accepted bone measures, and NASA has
a standard for them.


Post 6 BMD T-Scores (Total Hip) 


Real Data 



Legend: ISS Observations (US & IPs); Mean; 95% CI of the Mean

Historical data from n=51 US & IP Long Duration Astronauts 
μ = -0.60, σ = 0.79




BMD T-Scores (Total Hip)


Hypothetical Pilot Study 
Under a New Situation 



Legend: Observations; Mean; 95% CI of the Mean

Hypothetical pilot data showing a reduction in the mean relative to historical data
μ = -0.73, σ = 1.57




Run Monte Carlo Simulations

• Let's stay consistent with the assumption that 
t-scores are normally distributed 

• Let's assume that the population has 
variability similar to our larger historical data 
from n=51 subjects. 

• Let's assume that the mean for this new 
situation is lower by about 20% of the range 
of historical 6mo data. 

— Informed by our pilot data and the literature 
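
Written out, these three assumptions specify the distribution that the simulations draw from. The symbols are introduced here for illustration: \mu_hist and \sigma_hist are the historical mean and standard deviation (read from the n=51 slide as -0.60 and 0.79), and R_hist is the range (maximum minus minimum) of the historical 6-month T-scores, which is not stated on these slides. The 0.20 factor is the approximate "about 20%" shift.

```latex
T_{\text{new}} \sim \mathcal{N}\!\left(\mu_{\text{new}},\ \sigma_{\text{hist}}^{2}\right),
\qquad
\mu_{\text{new}} = \mu_{\text{hist}} - 0.20\, R_{\text{hist}},
\qquad
\mu_{\text{hist}} = -0.60,\quad \sigma_{\text{hist}} = 0.79
```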


One Simulated Sample of 
BMD T-Scores 


One Simulated Dataset 
Simulated n=10 Astronauts


Legend: Simulated Observation; 95% CI of the Simulated Mean Post-Flight


Another Simulated Sample of
BMD T-Scores

Another Simulated Dataset...
Simulated n=10 Astronauts


Legend: Simulated Observation; 95% CI of the Simulated Mean Post-Flight


Several Simulated Datasets. 




Simulated BMD T-Scores 


After 1000 Samples 

1000 Simulations of n=10 Astronauts where the Mean BMD t-score 
is Lower than Historical data by 20% of the Historical Range 

[Figure: simulated t-scores plotted by simulated sample number, each sample representing n=10 observations. Legend: Simulated t-score; Mean of Simulated t-scores; 95% CI of the Simulated Mean]

Each column of hollow diamonds along the X-axis represents a simulated sample of n=10. 

The probability for any single astronaut to fall under the cut-off = 0.014. 
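
As a rough cross-check, and assuming the probability was computed under the normal model with the historical standard deviation (read from the n=51 slide as 0.79), the reported 0.014 implies how far the cut-off sits below the simulated mean. Neither the cut-off nor the simulated mean is stated here, so only their difference is recovered. If 0.014 is instead the empirical fraction of simulated values under the cut-off, the result is approximate.

```python
from statistics import NormalDist

p_below = 0.014       # probability reported on the slide
sigma_hist = 0.79     # historical SD as read from the n=51 slide (assumption)

# Standard-normal quantile corresponding to the reported probability.
z = NormalDist().inv_cdf(p_below)

# Distance from the simulated mean down to the cut-off, in T-score units.
distance_below_mean = -z * sigma_hist

print(f"z = {z:.2f}; cut-off is about {distance_below_mean:.2f} T-score units below the mean")
```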


What have we gained? 


• A visual representation of the individual risk of 
falling below a pre-established, clinically relevant
threshold. 

- P(fail) = 0.014 

• A reminder that our single-sample pilot data is
just that: a random sample from the larger
population. 

• An extension of our data, informed also by 
historical data and the literature, enabling a 
discussion of next steps. 


What did it cost?

• Zero cost for additional subjects 

• Zero up mass for spaceflight 

• Zero competition with other studies 

• ...Zero new data! 

• Time & expertise for reflection following pilot 
study to contemplate simulation parameters 

• Time & expertise for conducting the Monte Carlo 
simulations. 

— Minimal software requirements... potentially free 



Limitations of MC Simulations 


They are never as good as real data & cannot 
stand alone 

- Useful as an augmentation tool 

Simulation parameters and assumptions can be 
tenuous & vulnerable to competing theories 

When pilot studies are small, their contributions 
to the MC Simulations can be questionable 

— Thus my recommendation to use the literature to help 
guide model assumptions & parameters 


Next...


Dr. Feiveson is going to take this idea one step 
further, and discuss the Bayesian approach to 
statistical analysis. 

- Can Bayes help with small-n?


